literature_pre = await FileAttachment("data/Literature.csv").csv()
benchmark_datasets_pre = await FileAttachment("data/Benchmark_datasets.csv").csv()
literature = literature_pre.filter(d => !d.Name.includes("EXCLUDE") && !d.Name == "")
benchmark_datasets = benchmark_datasets_pre.filter(l => l.Name != "")
paper_sources_pre = await FileAttachment("data/Paper Sources.csv").csv()
paper_source = paper_sources_pre.filter(l => l.Name != "")Benchmark Datasets for Graph Layout Algorithms
1 Introduction
Benchmarking is a crucial aspect of computer science, as it allows researchers, developers, and engineers to compare the performance of various systems, algorithms, or hardware. A benchmark is a standardized test or set of tests used to measure and compare the performance of hardware, software, or systems under specific conditions. Benchmarking aims to provide objective and consistent metrics that allow for fair comparisons and informed decision-making. Benchmarks are widely used in various fields, including computer hardware evaluation, software optimization, and system performance analysis. In all these fields, benchmarking provides a standardized and objective way to compare and assess the performance of different systems, algorithms, or software implementations. It aids in making informed decisions about which solution best suits a specific use case or requirement.
The same is true for graph drawing, particularly for studying the performance and results of graph layout algorithms (Di Bartolomeo et al. 2024). Benchmark datasets can provide a standardized set of graphs with known properties and characteristics. These graphs can vary in size, density, connectivity, and structure. Researchers can objectively compare their performance or the quality of their results by applying various graph layout algorithms to the same benchmark dataset.
In our own work, we have faced challenges in determining which benchmark datasets to use for evaluating the layout algorithms we developed. This led us to build a collection of benchmark datasets used in previous graph layout algorithm papers and a Graph Benchmark Datasets website for perusing the collection. We collected 196 papers from Graph Drawing, IEEE venues, and Eurographics venues that include computational evaluations of graph layout algorithms. We then searched for the datasets used for the benchmarks. We collected the data we could find and had permission to archive and re-created datasets that were lost but had sufficient replication instructions. We classified graphs by their features and statistics. We also found text and images from papers using those graphs.
This paper aims to present this graph drawing benchmark sets resource to the Graph Drawing and visualization communities so that other authors may benefit from our archiving and organization efforts. We hope this resource will encourage the discoverability of these datasets and the ease of running benchmarks for graph layout algorithms. Moreover, as reliable access to datasets is fundamental for replicability, we aim to preserve these datasets in perpetuity. Beyond collecting available datasets and re-creating lost ones, we archived all our materials on OSF for long-term availability. This included saving each graph in multiple common file formats to avoid translation issues for individual authors. We believe our work will lead to more reproducible and replicable Graph Drawing research by providing a long-term and open archive of the data we use in our computational evaluations.
Specifically, we contribute:
- A collection of benchmark datasets used for graph layout algorithm evaluations, including:
- Re-creating lost datasets based on paper descriptions and a list of unavailable data,
- A classification of these datasets by graph features and statistics to aid in identifying appropriate evaluation candidates,
- Exemplar text and images from papers that use these datasets,
- A website for perusing this collection, and
- A long-term archive of our metadata and the collection to aid in reproducibility and replicability of evaluations.
Please see Section 5 below for our supplemental materials, including links to the website, code and data, OSF archive, and Notion database.
2 Collection process
The information we collected is a by-product of a larger systematic review we conducted related to graph layout algorithms, which included 196 papers The following figure shows the original data collection process (from (Di Bartolomeo et al. 2024)):
The core of our data collection was the last seven years of Graph Drawing proceedings (264 papers in total), filtering out papers without computational evaluations. We further expanded our graph layout algorithm papers collection by searching IEEE Xplore and Wiley digital libraries to include papers from TVCG and CGF. Then, we checked all the citations in the papers we collected from Graph Drawing, and added to our collection all the papers that were cited more than 4 times in the last 6 years of graph drawing — to make sure we included algorithm papers that were important, but not published at GD, on IEEE Xplore or on the Wiley digital library. For each paper, we collected which features were handled by the graph layout algorithm presented, and what dataset was used in the evaluation. When collecting features, we always prioritized the authors’ own wording and description of the features. The tagging of the papers was done by two people at the same time, over two different passes for sanity-checking purposes. Following this process, we tried to track down the datasets used in computational evaluations: (1) we first looked for official or linked supplemental material, (2) we next Googled the dataset or paper name, (3) finally, we emailed the authors. When in doubt about the artifact replication policy, we asked the owners or authors by email. In cases where it was explicitly mentioned that approval should be received before redistribution, we did not redistribute the datasets. However, if we received approval or did not receive an answer and found no explicit policy preventing redistirbution, we collected and stored the dataset to preserve it for future researchers. If any dataset owner or author discovers their own work in our collection and would like it removed, we kindly request that they contact us (see Section 7 below), and we will promptly remove it. Furthermore, we want to emphasize that we do not assert any ownership rights over the datasets listed.
The chart below shows the distribution of papers across different venues:
The following one, instead, shows the distribution of collected papers’ publication date:
After collecting the datasets, we looked more in-depth into their contents, running analysis on a number of statistics associated with the graphs contained in them.
We finally ended up collecting 46 datasets, listed below:
The purpose of our collection effort is to facilitate comparisons and replication of results, and to provide a resource for researchers to find datasets that are used in the literature. We also provide links to the specific papers that use each dataset, so that they can be used as examples and sources of information.
3 Datasets in use
The following chart shows how many times we found a dataset being used in the papers we collected. It excludes custom edits to the datasets, which are discussed later in this document.
In the data we collected, the most used dataset is Rome-Lib, followed by assorted collaboration networks (which in many cases refers to datasets of academic collaborations such as dblp or vispubdata). The third most used dataset is from C.Walshaw - it is important to note that the Walshaw dataset is available as part of other collections - for instance, its graphs are found in SNAP. However, during the collection process, we preferred giving precedence to how the authors reported their own information. Thus, if the authors claimed the data was from the Walshaw collection, we reported it as such.
In the following sections, the reader will find details about the classifications and datasets in detail. Each dataset gets a dedicated, collapsible section, that contains the following information: - A sparkline visualization is shown in case the dataset contains multiple graphs, illustrating the distribution of graph sizes (in nodes) in the graphs in the dataset.
3.1 Classification of the Datasets
We classified the datasets in different categories, based on their origins or amount of graphs contained in them:
Uniform Benchmark datasets
Uniform Benchmark datasets are standalone widely used collections of graphs featuring uniform characteristics - usually simple, generic graphs, often used in evaluations that run over thousands of graphs to report average completion times, or other experiments where the reported metrics are usually aggregated.
The first of these collapsible sections is shown already expanded, to give an example of the content that can be found in each of them. The content is generated dynamically based on the data we collected.
The following collections, together with Rome-Lib, can be easily accessed from the homepage of the Graph Drawing Conference website, and are therefore well known and widely used.
The following collections are particularly varied in features:
The following two collections are particularly relevant for the display of temporal event sequences:
The datasets listed below are particularly useful when researchers are interested in testing for crossing numbers. Indeed, they both deal with graphs that have a known crossing numbers.
Collections of graphs that focus on one specific attribute can also be useful in case the attribute relates to geographical data:
And finally, a collection of graph problems:
Established Network Repositories
A popular choice is to use datasets from Established Network Repositories. These are ample collections, often organized in dedicated websites which also offer a few stats about the contained graphs. Because their hosts are already dedicated to the maintaining and reporting of information on these collections, we do not include here any storage of the data (which would be redundant) or report statistics on them. Rather, our analysis here is focused on highlighting their properties, origins, and ways in which they have been used.
Compiled by the Mathematical and Computational Sciences Division of the Information Technology Laboratory of the National Institute of Standards and Technology, the Matrix Market is a repository of test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services. It has been used for experiments with generic graphs, large graphs, and graphs with weighted edges. Each matrix entry is accompanied by metadata that includes the matrix name and identifier, dimensions (number of rows and columns), number of non-zero elements, symmetry type (symmetric, skew-symmetric, or general) and data type (real, complex, integer, or pattern). The website also provides visualizations of the matrices, helping users understand their structure and distribution of non-zero elements. Downloads are also provided in a variety of formats, including their own Matrix Market Exchange (MME) format, Harwell-Boeing, and MATLAB.
The Network Repository was proposed in 2015 by Rossi and Ahmed of Purdue University as a visually interactive real world graph tool and database. Graphs in this repository are tagged with their real-world source, have in-depth analysis, and even a preview visualization of each individual graph. Types of graphs include social networks, infrastructure networks, biological networks, and many others. The repository also offers sources for individual graphs. It contains many graphs from other benchmark sets described here.
The Pajek Program for Large Network Analysis is a tool developed and hosted by Andrej Vlado and some of their colleagues. As part of this program, they later compiled relevant graphs and links to other datasets, which we call today the Pajek collection. As a curiosity, pajek means spider in Slovenian. Many of Pajek graphs have been included as part of the SuiteSparse Matrix Collection.
The SNAP repository is a collection of datasets assembled as part of the Stanford Network Analysis Platform, which started in 2004 and launched their website in 2009. Well-known, widely used graph repository. A number of graphs that became relevant individually are included in SNAP, such as Enron, Amazon, Protein Interactions datasets and various Social Network graphs. SNAP contains generic graphs, directed and undirected graphs, dynamic graphs and more. Graphs are presented with name and descriptions and a few statistics such as a general description, numbers of nodes and edges, source and reference information.
Compiled by Donald Knuth, the Stanford Graphbase consists of programs and datasets for network analysis. The datasets accompany a textbook, “The Stanford GraphBase: A Platform for Combinatorial Computing”.
The SuiteSparse Matrix Collection, formerly known as the University of Florida Sparse Matrix Collection, is a comprehensive repository of 2893 sparse matrices. All graphs in SuiteSparse belong to groups which will have more information about the graphs and the sub-collections they belong to. In our Descriptions from the Literature section we also highlight a few tables with the specific graphs used in a couple of papers. From “The university of Florida Sparse Matrix Collection”, Davis and Hu describe the origin of this network repository. Namely they cite the Harwell-Boeing collection as the starting point for SuiteSparse, then called the University of Florida (UF) Sparse matrix collection, back in 1991. Other groups, or collections, have then been added to SuitseSparse through the years, mainly focusing on real-world matrices and other relevant problems related to them.
Single Graphs
A number of papers used individual, Single Graphs for their experiments instead of a collection. These graphs have become popular because of their properties as individual graphs - see, for example, the Enron dataset below, a huge individual graph that has a large variance in node degree distribution. Many of these graphs are also available in other repositories - their locations are noted wherever known.
Aggregate Collections
Many papers used graphs from specific domains that contain particular characteristics (e.g., geographical coordinates are often found in airline data). Instead of collecting each one of these individual, contextual datasets, we aggregated them in further subcategories, and called them Aggregate collections. Individual information about about each aggregate collection can be found in the papers that contain them.
Subsets of other collections
Finally, we collected some datasets that we noticed being subsets of existing collections. This is a phenomenon that can happen through the years, through the redistribution and through the merging of different sources: the Walshaw dataset, for instance, was and still is distributed and cited as its own standalone dataset, but its graphs can be found as part of many other larger collections, such as SNAP. We classified these datasets as Subsets.
Custom-made datasets
In the data we collected, we also found several instances of custom-made datasets. We consider custom-made datasets either edits to pre-existing datasets, where the authors found it necessary to either split or modify the dataset in a particular way, or datasets completely made up from scratch using random generators or custom-made code. This can happen in cases where the authors of a paper needed a dataset containing particular characteristics which was not easy to find in the wild, so a new dataset was crafted.
For instance, consider the case where the authors of a paper develop an algorithm that works on hypergraphs. They want to test that the algorithm works, and test its performance on hypergraphs of various sizes, but datasets containing hypergraphs are difficult to find. For this reason, the authors craft one dataset synthetically, or take a pre-existing dataset and edit it so that it now contains hyperedges.
We split custom-made datasets in three categories, with their occurrences in the corpus of papers illustrated below:
Replicable datasets indicate cases where the authors have given enough information so that the experiment can be replicated exactly as it was run by the authors of a paper, or closely enough that the results obtained reflect the published ones very closely. This includes cases where either the authors published the entire dataset they used, they published the code they used to generate the dataset, or include an exact description of the steps they took to generate it.
Reproducible datasets are cases where the authors described the steps they took to generate and/or edit their datasets, but not in-depth enough so that the exact same graphs can be reproduced, and did not redistribute it. Results can still be reproduces somewhat closely if the authors took care to report enough information about their graphs.
For non-replicable datasets, we indicate cases where the authors did not distribute their datasets and did not include enough information in the paper so that their results could be replicated.
This information is closely tied to the distribution of supplemental material in papers, that is shown in the chart below:
This discussion is part of a larger discourse on research replicability, that is gaining traction in the scientific community. The ACM, for instance, has a policy on artifact review and badging, where authors are encouraged to submit their artifacts for review, and if they pass, they receive a badge that indicates the artifact is available for review. This is a step towards making research more replicable and reproducible, and we hope that our work will contribute to this effort.
See, e.g., ACM’s definitions at https://www.acm.org/publications/policies/artifact-review-and-badging-current.
3.4 Details of the datasets
For each of the datasets collected as part of our process, we conducted a brief analysis of their contents. Where possible, the analysis includes information about the number of nodes per graph, the source of the dataset, which papers have used the dataset and what graph features they took into account. The following section defaults to illustrating the contents of Rome-Lib, but the reader can easily change the dataset in focus by modifying the value of the dropdown menu below:
3.5 Random generation
We discussed above that in a few cases, authors have generated their own datasets to test their algorithms. We call these generated datasets synthetic or custom, as opposed to datasets found in the wild. Generating a synthetic dataset can be essential for several reasons, particularly in the context of algorithm development and validation:
- Privacy and Publication Constraints: Imagine you’ve created an innovative algorithm using a dataset that, due to privacy concerns, cannot be shared publicly. To validate and share your work without compromising data confidentiality, creating a synthetic dataset that mirrors the essential features of the original data allows you to showcase your algorithm’s effectiveness while adhering to privacy restrictions.
- Benchmarking and Comparative Analysis: Suppose you have access to a single dataset that can be shared and wish to prove your algorithm’s robustness across various scenarios. Generating synthetic datasets with comparable characteristics provides a controlled environment to conduct benchmark studies. This approach enables scientists to demonstrate that your algorithm performs consistently well across datasets with analogous features, thereby reinforcing its applicability and reliability.
- Addressing Data Availability Issues: In certain situations, you might design an algorithm to process graphs with specific attributes only to discover the absence of publicly available datasets showcasing these features. Synthetic datasets become invaluable here, allowing you to create tailored data that incorporates the necessary characteristics. This approach not only facilitates the testing of your algorithm in a relevant context but also helps in illustrating its potential in hypothetical yet plausible scenarios.
We use the word synthetic too in cases where the dataset has been altered, sliced, or other modifications have been applied to it.
The following is a practical example of a case where you would need to edit a network:
Imagine you are building a visualization to deal with the entirety of the Enron Corpus dataset , which has hundreds of thousands of nodes and edges. Because of its size, you decide to slice up the large dataset in many smaller files, so you can run tests. This particular dataset, in addition to being a challenging problem because of its size, also has an interesting distribution of connection densities: some nodes are extremely well connected, while others are much less connected. Indeed, the dataset is comprised of a collection of emails sent by Enron executives - between themselves or between other employees of the company. Because of how the dataset is constructed - where only the emails from the executives are taken into account - it has a distinctive skew in the connectedness of the data: 158 nodes are extremely well connected, while the rest of the nodes are much less connected. Because of this, there’s uncountable mistakes that can be done through slicing this graph: for instance, slicing it so that the subset includes a less dense section of the entire graph will fail to provide a representative section.
While the creation of a synthetic dataset is a perfectly viable way to produce a benchmark set, in addition to replicability criteria (as discussed above), we also have to pay particular attention to a number of statistics related to the structure of graphs — both when generating new datasets, and when slicing up existing datasets to reduce their size, or to create more graphs from a single, larger graph.
The list of features to take into account to claim that a synthetic graph is comparable to another one would be long, and perhaps out of the scope of this publication. These are just a few examples of what could be relevant:
- Size (nodes, edges): The total number of nodes and edges in the graph, indicating its overall scale.
- Diameter: The longest shortest path between any two nodes, showing the graph’s maximum extent.
- Density: The ratio of existing edges to possible edges, reflecting how closely knit the graph is.
- Motifs: Recurring, significant subgraphs; identify which motifs occur more or less frequently than expected.
- Connectedness: Evaluates both the number and sizes of graph components, along with detailed analysis per component.
- Centrality Measures (betweenness, closeness, degree): Quantify the importance of nodes based on their position and connections within the graph.
- Special Cases:
- Layers: Characteristics like number of layers, nodes per layer, and inter-layer connections.
- Disjoint Groups: The count and size distribution of separate clusters, plus an analysis of each group’s properties.
- Overlapping Sets: Number and size distribution of intersecting groups, along with detailed features.
- Additional Node/Edge Data: Information such as timestamps, attributes, or weights that add context to nodes/edges.
- Dynamic: Describes changes in the graph over time, including the nature and frequency of node/edge modifications.
The generation of random graphs that accurately mimic specific features presents a complex challenge, that has been explored abundantly in the past, but has found no universal solution yet. Two examples of popular models used to generate synthetic datasets are the Erdős-Rényi (ER) and the Barabási-Albert (BA) models, each one with their distinct focus and limitations:
The ER model excels in creating graphs with uniform edge distribution, ideal for testing an algorithm’s ability to evenly space nodes and reduce edge crossings. However, it falls short in replicating the complex, non-uniform connections seen in real-world networks, limiting its applicability to scenarios requiring simplicity and randomness. Conversely, the BA model produces scale-free networks with a few highly connected nodes, reflecting the hierarchical structure of many natural and human-made systems. It challenges layout algorithms to effectively display these hubs without cluttering. The limitation here is its focus on growth and preferential attachment, which might not suit all types of networks, particularly those without clear hub structures.
As there is no universal solution for random graph generation, our recommendation is to try as much as possible to pay attention to research replicability criteria, such as redistributing the generated dataset as supplemental material in the paper.
4 Conclusion
We presented a comprehensive benchmark dataset collection for graph layout algorithms. Compiling, organizing, and making a wide array of datasets with diverse characteristics accessible not only facilitates rigorous and fair comparisons of algorithmic performance but also addresses critical issues of replicability and reproducibility in research. The Graph Benchmark Datasets website, along with the efforts for long-term archival, ensures these valuable resources remain available to the community. Web aim to help both current and future researchers to benefit from a rich repository of data, standardizing the evaluation process for graph layout algorithms, fostering innovation and development within the field and contributes to the advancement of graph drawing and visualization research. With this work, we hope to underscore the importance of accessible, well-documented benchmark datasets in driving scientific progress and highlights our commitment to enhancing the integrity and reliability of computational evaluations in the Graph Drawing community.
5 Research Material Statements
The supplemental material for this publication includes:
- The Graph Benchmark Datasets website, hosted on GitHub Pages.
- Code for the website, metadata on the benchmark datasets, and data processing scripts are stored in the gd_benchmark_sets GitHub repository. TODO (CD 2024-03-08): clean up and add documentation for how to use. Esp. clean the
datafolder,readme.md, the random.ipynbfiles. AddLICENCEfile specifying ideally Apache 2.0 for code, and info toreadme.mdspecifying Apache 2.0 for code and CC BY 4.0 for data & metadata. - This code and metadata, along with the benchmark datasets themselves in several common graph file formats, are archived for long-term availability in the Benchmark datasets for graph drawing project on OSF. TODO (CD 2024-03-08): add the website code + data. This can be done by creating a registration on OSF. I linked the GitHub repo as a data source on OSF, but the registration is what copies it to OSF as an archived snapshot.
- The original database used for metadata collection and storage is available on Notion as Benchmark datasets.
6 Acknowledgements
This work was supported in part by the U.S. National Science Foundation (NSF) under award number CAREER IIS-1762268 and the Austrian Science Fund (FWF) [ESP 513-N].
8 License
This work is licensed under a Creative Commons Attribution 4.0 International License. All code is released under the Apache 2.0 License.
9 Conflict of Interest
The authors declare that there are no competing interests.
References
Adai, Alex T., Shailesh V. Date, Shannon Wieland, and Edward M. Marcotte. 2004. “LGL: Creating a Map of Protein Function with an Algorithm for Visualizing Very Large Biological Networks.” Journal of Molecular Biology 340 (1): 179–90. https://doi.org/https://doi.org/10.1016/j.jmb.2004.04.047.
Arleo, Alessio, Walter Didimo, Giuseppe Liotta, and Fabrizio Montecchiani. 2019. “A Distributed Multilevel Force-Directed Algorithm.” IEEE Transactions on Parallel and Distributed Systems 30 (4): 754–65. https://doi.org/10.1109/tpds.2018.2869805.
Bachmaier, Christian, Andreas Gleißner, and Andreas Hofmeier. 2012. “DAGmar: Library for DAGs.” https://www.infosun.fim.uni-passau.de/~chris/down/MIP-1202.pdf.
Barth, Wilhelm, Michael Jünger, and Petra Mutzel. 2002. “Simple and Efficient Bilayer Cross Counting.” In Graph Drawing, edited by Michael T. Goodrich and Stephen G. Kobourov, 130–41. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-36151-0_13.
Bartolomeo, Sara di, Yixuan Zhang, Fangfang Sheng, and Cody Dunne. 2021. “Sequence Braiding: Visual Overviews of Temporal Event Sequences and Attributes.” IEEE Transactions on Visualization and Computer Graphics 27 (2): 1353–63. https://doi.org/10.1109/tvcg.2020.3030442.
Batagelj, Vladimir, and Andrej Mrvar. 2006. “Pajek Datasets.” In. http://vlado.fmf.uni-lj.si/pub/networks/data/.
Battista, Giuseppe Di, Ashim Garg, Giuseppe Liotta, Armando Parise, Roberto Tamassia, Emanuele Tassinari, Francesco Vargiu, and Luca Vismara. 2000. “Drawing Directed Acyclic Graphs: An Experimental Study.” International Journal of Computational Geometry & Applications 10 (06): 623–48. https://doi.org/10.1142/s0218195900000358.
Battista, Giuseppe Di, Ashim Garg, Giuseppe Liotta, Roberto Tamassia, Emanuele Tassinari, and Francesco Vargiu. 1997. “An Experimental Comparison of Four Graph Drawing Algorithms.” Computational Geometry 7 (5-6): 303–25. https://doi.org/10.1016/s0925-7721(96)00005-3.
Bekos, Michael A., Henry Förster, Christian Geckeler, Lukas Holländer, Michael Kaufmann, Amadäus M. Spallek, and Jan Splett. 2018. “A Heuristic Approach Towards Drawings of Graphs with High Crossing Resolution.” In Lecture Notes in Computer Science, 271–85. Springer International Publishing. https://doi.org/10.1007/978-3-030-04414-5_19.
Binucci, Carla, Markus Chimani, Walter Didimo, Giuseppe Liotta, and Fabrizio Montecchiani. 2016. “Placing Arrows in Directed Graph Drawings.” In Lecture Notes in Computer Science, 44–51. Springer International Publishing. https://doi.org/10.1007/978-3-319-50106-2_4.
Binucci, Carla, Walter Didimo, Michael Kaufmann, Giuseppe Liotta, and Fabrizio Montecchiani. 2022. “Placing Arrows in Directed Graph Layouts: Algorithms and Experiments.” Computer Graphics Forum 41 (1): 364–76. https://doi.org/https://doi.org/10.1111/cgf.14440.
Börsig, Katharina, Ulrik Brandes, and Barna Pasztor. 2020. “Stochastic Gradient Descent Works Really Well for Stress Minimization.” In Lecture Notes in Computer Science, 18–25. Springer International Publishing. https://doi.org/10.1007/978-3-030-68766-3_2.
Brandes, Ulrik, and Christian Pich. 2006. “Eigensolver Methods for Progressive Multidimensional Scaling of Large Data.” In Proceedings of the 14th International Conference on Graph Drawing, 42–53. GD’06. Berlin, Heidelberg: Springer-Verlag. https://doi.org/10.5555/1758612.1758620.
Buchheim, Christoph, Markus Chimani, Dietmar Ebner, Carsten Gutwenger, Michael Jünger, Gunnar W. Klau, Petra Mutzel, and René Weiskircher. 2008. “A Branch-and-Cut Approach to the Crossing Number Problem.” Discrete Optimization 5 (2): 373–88. https://doi.org/10.1016/j.disopt.2007.05.006.
Calamoneri, Tiziana, Valentino Di Donato, Diego Mariottini, and Maurizio Patrignani. 2018. “Visualizing Co-Phylogenetic Reconciliations.” In Lecture Notes in Computer Science, 334–47. Springer International Publishing. https://doi.org/10.1007/978-3-319-73915-1_27.
Chimani, Markus, and Carsten Gutwenger. 2012. “Advances in the Planarization Method: Effective Multiple Edge Insertions.” In Graph Drawing, 87–98. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-25878-7_10.
Chimani, Markus, Carsten Gutwenger, and Petra Mutzel. 2006. “Experiments on Exact Crossing Minimization Using Column Generation.” In Experimental Algorithms, edited by Carme Àlvarez and María Serna, 303–15. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/11764298_28.
Chimani, Markus, Carsten Gutwenger, Petra Mutzel, and Hoi-Ming Wong. 2010. “Layer-Free Upward Crossing Minimization.” ACM J. Exp. Algorithmics 15 (March). https://doi.org/10.1145/1671970.1671975.
Chimani, Markus, Max Ilsen, and Tilo Wiedera. 2021. “Star-Struck by Fixed Embeddings: Modern Crossing Number Heuristics.” In Lecture Notes in Computer Science, 41–56. Springer International Publishing. https://doi.org/10.1007/978-3-030-92931-2_3.
Chimani, Markus, Karsten Klein, and Tilo Wiedera. 2016. “A Note on the Practicality of Maximal Planar Subgraph Algorithms.” In Graph Drawing and Network Visualization, edited by Yifan Hu and Martin Nöllenburg, 357–64. Cham: Springer International Publishing.
Chimani, Markus, Petra Mutzel, and Immanuel Bomze. 2008. “A New Approach to Exact Crossing Minimization.” In Algorithms - ESA 2008, 284–96. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-87744-8_24.
Chimani, Markus, and Tilo Wiedera. 2016. “An ILP-Based Proof System for the Crossing Number Problem.” In 24th Annual European Symposium on Algorithms (ESA 2016), edited by Piotr Sankowski and Christos Zaroliagis, 57:29:1–13. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. https://doi.org/10.4230/LIPIcs.ESA.2016.29.
Civril, Ali, Malik Magdon-Ismail, and Eli Bocek-Rivele. 2006. “SDE: Graph Drawing Using Spectral Distance Embedding.” In Graph Drawing, edited by Patrick Healy and Nikola S. Nikolov, 512–13. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/11618058_48.
Clancy, Kieran, Michael Haythorpe, and Alex Newcombe. 2019a. “A Survey of Graphs with Known or Bounded Crossing Numbers.” Australas. J Comb. 78: 209–96. https://api.semanticscholar.org/CorpusID:119313173.
———. 2019b. “An Effective Crossing Minimisation Heuristic Based on Star Insertion.” Journal of Graph Algorithms and Applications 23 (2): 135–66. https://doi.org/10.7155/jgaa.00487.
Dang, T. N., N. Pendar, and A. G. Forbes. 2016. “TimeArcs: Visualizing Fluctuations in Dynamic Networks.” Computer Graphics Forum 35 (3): 61–69. https://doi.org/10.1111/cgf.12882.
Davis, Timothy A., and Yifan Hu. 2011. “The University of Florida Sparse Matrix Collection.” ACM Trans. Math. Softw. 38 (1). https://doi.org/10.1145/2049662.2049663.
Demel, Almut, Dominik Dürrschnabel, Tamara Mchedlidze, Marcel Radermacher, and Lasse Wulf. 2018. “A Greedy Heuristic for Crossing-Angle Maximization.” In Lecture Notes in Computer Science, 286–99. Springer International Publishing. https://doi.org/10.1007/978-3-030-04414-5_20.
Di Bartolomeo, Sara, Tarik Crnovrsanin, David Saffo, Eduardo Puerta, Connor Wilson, and Cody Dunne. 2024. “Evaluating Graph Layout Algorithms: A Systematic Review of Methods and Best Practices.” Computer Graphics Forum n/a (n/a): e15073. https://doi.org/10.1111/cgf.15073.
Di Bartolomeo, Sara, Mirek Riedewald, Wolfgang Gatterbauer, and Cody Dunne. 2022. “STRATISFIMAL LAYOUT: A Modular Optimization Model for Laying Out Layered Node-Link Network Visualizations.” IEEE Transactions on Visualization and Computer Graphics 28 (1): 324–34. https://doi.org/10.1109/tvcg.2021.3114756.
Di Battista, Giuseppe, Ashim Garg, and Giuseppe Liotta. 1995. “An Experimental Comparison of Three Graph Drawing Algorithms (Extended Abstract).” In Proceedings of the Eleventh Annual Symposium on Computational Geometry, 306–15. SCG ’95. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/220279.220312.
Donati, Beatrice, Christian Baudet, Blerina Sinaimeri, Pierluigi Crescenzi, and Marie-France Sagot. 2015. “EUCALYPT: Efficient Tree Reconciliation Enumerator.” Algorithms for Molecular Biology 10: 3. https://doi.org/10.1186/s13015-014-0031-3.
Eades, Peter, Quan Nguyen, and Seok-Hee Hong. 2018. “Drawing Big Graphs Using Spectral Sparsification.” In Lecture Notes in Computer Science, 272–86. Springer International Publishing. https://doi.org/10.1007/978-3-319-73915-1_22.
Elzen, Stef van den, Danny Holten, Jorik Blaas, and Jarke J. van Wijk. 2013. “Reordering Massive Sequence Views: Enabling Temporal and Structural Analysis of Dynamic Networks.” In 2013 IEEE Pacific Visualization Symposium (PacificVis). IEEE. https://doi.org/10.1109/pacificvis.2013.6596125.
Frishman, Yaniv, and Ayellet Tal. 2007. “Multi-Level Graph Layout on the GPU.” IEEE Transactions on Visualization and Computer Graphics 13 (6): 1310–19. https://doi.org/10.1109/TVCG.2007.70580.
Frishman, Y., and A. Tal. 2008. “Online Dynamic Graph Drawing.” IEEE Transactions on Visualization and Computer Graphics 14 (4): 727–40. https://doi.org/10.1109/tvcg.2008.11.
Gange, Graeme, Peter J. Stuckey, and Kim Marriott. 2011. “Optimal k-Level Planarization and Crossing Minimization.” In Graph Drawing, 238–49. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-18469-7_22.
Gansner, E. R., Yifan Hu, and S. North. 2013. “A Maxent-Stress Model for Graph Layout.” IEEE Transactions on Visualization and Computer Graphics 19 (6): 927–40. https://doi.org/10.1109/tvcg.2012.299.
Gansner, E. R., Y. Koren, and S. C. North. 2005. “Topological Fisheye Views for Visualizing Large Graphs.” IEEE Transactions on Visualization and Computer Graphics 11 (4): 457–68. https://doi.org/10.1109/tvcg.2005.66.
Gansner, Emden R., and Yehuda Koren. 2006. “Improved Circular Layouts.” In Graph Drawing, 386–98. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-70904-6_37.
Gansner, Emden R., and Stephen C. North. 2000. “An Open Graph Visualization System and Its Applications to Software Engineering.” Softw. Pract. Exper. 30 (11): 1203–33.
Giovannangeli, Loann, Frederic Lalanne, Romain Giot, and Romain Bourqui. 2023. “FORBID: Fast Overlap Removal by Stochastic GradIent Descent for Graph Drawing.” In Graph Drawing and Network Visualization, edited by Patrizio Angelini and Reinhard von Hanxleden, 61–76. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-22203-0_6.
Gray, Kathryn, Mingwei Li, Reyan Ahmed, and Stephen Kobourov. 2023. “Visualizing Evolving Trees.” In Graph Drawing and Network Visualization, edited by Patrizio Angelini and Reinhard von Hanxleden, 319–35. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-22203-0_23.
Gronemann, Martin, Michael Jünger, Frauke Liers, and Francesco Mambelli. 2016. “Crossing Minimization in Storyline Visualization.” In Lecture Notes in Computer Science, 367–81. Springer International Publishing. https://doi.org/10.1007/978-3-319-50106-2_29.
Gutwenger, Carsten. 2010. “Application of SPQR-Trees in the Planarization Approach for Drawing Graphs.” PhD thesis, Technische Universität Dortmund. https://eldorado.tu-dortmund.de/handle/2003/27430?mode=full.
Gutwenger, Carsten, and Petra Mutzel. 2004. “An Experimental Study of Crossing Minimization Heuristics.” In Graph Drawing, 13–24. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-24595-7_2.
Hachul, Stefan, and Michael Juenger. 2006. “An Experimental Comparison of Fast Algorithms for Drawing General Large Graphs,” 235–50. https://doi.org/10.1007/11618058_22.
Hachul, Stefan, and Michael Jünger. 2005. “Drawing Large Graphs with a Potential-Field-Based Multilevel Algorithm.” In Graph Drawing, 285–95. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-31843-9_29.
Heinsohn, Niklas, and Michael Kaufmann. 2018. “An Interactive Tool to Explore and Improve the Ply Number of Drawings.” In Lecture Notes in Computer Science, 38–51. Springer International Publishing. https://doi.org/10.1007/978-3-319-73915-1_4.
Hong, Seok-Hee, Peter Eades, Marnijati Torkel, Ziyang Wang, David Chae, Sungpack Hong, Daniel Langerenken, and Hassan Chafi. 2019. “Multi-Level Graph Drawing Using Infomap Clustering.” In Lecture Notes in Computer Science, 139–46. Springer International Publishing. https://doi.org/10.1007/978-3-030-35802-0_11.
Jabrayilov, Adalat, Sven Mallach, Petra Mutzel, Ulf Rüegg, and Reinhard von Hanxleden. 2016. “Compact Layered Drawings of General Directed Graphs.” In Lecture Notes in Computer Science, 209–21. Springer International Publishing. https://doi.org/10.1007/978-3-319-50106-2_17.
Jolly, Mitchell. 2017. “Chess Game Dataset (Lichess).” Kaggle. https://www.kaggle.com/datasets/datasnaek/chess?resource=download.
Jünger, Michael, and Petra Mutzel. 1996. “Exact and Heuristic Algorithms for 2-Layer Straightline Crossing Minimization.” In Graph Drawing, edited by Franz J. Brandenburg, 337–48. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/BFb0021817.
Jünger, Michael, Petra Mutzel, and Christiane Spisla. 2018. “A Flow Formulation for Horizontal Coordinate Assignment with Prescribed Width.” In Lecture Notes in Computer Science, 187–99. Springer International Publishing. https://doi.org/10.1007/978-3-030-04414-5_13.
Kindermann, Philipp, Wouter Meulemans, and André Schulz. 2018. “Experimental Analysis of the Accessibility of Drawings with Few Segments.” Journal of Graph Algorithms and Applications 22 (3): 501–18. https://doi.org/10.7155/jgaa.00474.
Klammler, Moritz, Tamara Mchedlidze, and Alexey Pak. 2018. “Aesthetic Discrimination of Graph Layouts.” In Graph Drawing and Network Visualization, edited by Therese Biedl and Andreas Kerren, 169–84. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-04414-5_12.
Kleinberg, Jon. 2002. “The Structure of Information Networks.” https://www.cs.cornell.edu/courses/cs685/2002fa/.
Klimt, Bryan, and Yiming Yang. 2004. “Introducing the Enron Corpus.” In International Conference on Email and Anti-Spam.
Knuth, Donald Ervin. 2009. The Stanford Graphbase: A Platform for Combinatorial Computing. Addison-Wesley.
Koren, Y. 2005. “Drawing Graphs by Eigenvectors: Theory and Practice.” Computers & Mathematics with Applications 49 (11): 1867–88. https://doi.org/10.1016/j.camwa.2004.08.015.
Koren, Yehuda, L. Carmel, and D. Harel. 2002. “ACE: A Fast Multiscale Eigenvectors Computation for Drawing Huge Graphs.” In IEEE Symposium on Information Visualization, 2002. INFOVIS 2002., 137–44. https://doi.org/10.1109/INFVIS.2002.1173159.
Kruiger, J. F., P. E. Rauber, R. M. Martins, A. Kerren, S. Kobourov, and A. C. Telea. 2017. “Graph Layouts by t-SNE.” Computer Graphics Forum 36 (3): 283–94. https://doi.org/10.1111/cgf.13187.
Leskovec, Jure, and Andrej Krevl. 2014. “SNAP Datasets: Stanford Large Network Dataset Collection.” http://snap.stanford.edu/data.
Letunic, Ivica, and Peer Bork. 2021. “Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation.” Nucleic Acids Research 49 (W1): W293–96. https://doi.org/10.1093/nar/gkab301.
Maddison, David, Katja Schulz, and Wayne Maddison. 2007. “The Tree of Life Web Project*.” Zootaxa 1668 (December): 19–40. https://doi.org/10.11646/zootaxa.1668.1.4.
Mallach, Sven. 2019. “A Natural Quadratic Approach to the Generalized Graph Layering Problem.” In Lecture Notes in Computer Science, 532–44. Springer International Publishing. https://doi.org/10.1007/978-3-030-35802-0_40.
Malone, D. W. 1975. “An Introduction to the Application of Interpretive Structural Modeling.” Proceedings of the IEEE 63 (3): 397–404. https://doi.org/10.1109/PROC.1975.9765.
Meidiana, Amyra, Seok-Hee Hong, and Peter Eades. 2023. “Shape-Faithful Graph Drawings.” In Graph Drawing and Network Visualization, edited by Patrizio Angelini and Reinhard von Hanxleden, 93–108. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-22203-0_8.
Meidiana, Amyra, Seok-Hee Hong, Peter Eades, and Daniel Keim. 2019. “A Quality Metric for Visualization of Clusters in Graphs.” In Lecture Notes in Computer Science, 125–38. Springer International Publishing. https://doi.org/10.1007/978-3-030-35802-0_10.
Meidiana, Amyra, Seok-Hee Hong, Marnijati Torkel, Shijun Cai, and Peter Eades. 2020. “Sublinear Time Force Computation for Big Complex Network Visualization.” Computer Graphics Forum 39 (3): 579–91. https://doi.org/https://doi.org/10.1111/cgf.14003.
Miller, Jacob, Vahan Huroyan, and Stephen Kobourov. 2023. “Spherical Graph Drawing by Multi-Dimensional Scaling.” In Graph Drawing and Network Visualization, edited by Patrizio Angelini and Reinhard von Hanxleden, 77–92. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-22203-0_7.
Muelder, Chris, and Kwan Liu Ma. 2008a. “Rapid Graph Layout Using Space Filling Curves.” IEEE Transactions on Visualization and Computer Graphics 14 (6): 1301–8. https://doi.org/10.1109/TVCG.2008.158.
Muelder, Chris, and Kwan-Liu Ma. 2008b. “A Treemap Based Method for Rapid Layout of Large Graphs.” In 2008 IEEE Pacific Visualization Symposium, 231–38. https://doi.org/10.1109/PACIFICVIS.2008.4475481.
Nachmanson, Lev, Arlind Nocaj, Sergey Bereg, Leishi Zhang, and Alexander Holroyd. 2017. “Node Overlap Removal by Growing a Tree.” Journal of Graph Algorithms and Applications 21 (5): 857–72. https://doi.org/10.7155/jgaa.00442.
Nickel, Soeren, Max Sondag, Wouter Meulemans, Markus Chimani, Stephen Kobourov, Jaakko Peltonen, and Martin Nöllenburg. 2019. “Computing Stable Demers Cartograms.” In Lecture Notes in Computer Science, 46–60. Springer International Publishing. https://doi.org/10.1007/978-3-030-35802-0_4.
Niedermann, Benjamin, and Ignaz Rutter. 2020. “An Integer-Linear Program for Bend-Minimization in Ortho-Radial Drawings.” In Lecture Notes in Computer Science, 235–49. Springer International Publishing. https://doi.org/10.1007/978-3-030-68766-3_19.
Noack, Andreas. 2004. “An Energy Model for Visual Graph Clustering.” In Graph Drawing, edited by Giuseppe Liotta, 425–36. Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-24595-7_40.
Nocaj, Arlind, Mark Ortmann, and Ulrik Brandes. 2015. “Untangling the Hairballs of Multi-Centered, Small-World Online Social Media Networks.” Journal of Graph Algorithms and Applications : JGAA 19 (2): 595–618. https://doi.org/10.7155/jgaa.00370.
Ortmann, Mark, Mirza Klimenta, and Ulrik Brandes. 2017. “A Sparse Stress Model.” Journal of Graph Algorithms and Applications 21 (5): 791–821. https://doi.org/10.7155/jgaa.00440.
Raj, Mukund, and Ross T. Whitaker. 2018. “Anisotropic Radial Layout for Visualizing Centrality and Structure in Graphs.” In Lecture Notes in Computer Science, 351–64. Springer International Publishing. https://doi.org/10.1007/978-3-319-73915-1_28.
Rossi, Ryan A., and Nesreen K. Ahmed. 2015. “The Network Data Repository with Interactive Graph Analytics and Visualization.” In AAAI. http://networkrepository.com.
Rüegg, Ulf, Thorsten Ehlers, Miro Spönemann, and Reinhard von Hanxleden. 2016. “A Generalization of the Directed Graph Layering Problem.” In Lecture Notes in Computer Science, 196–208. Springer International Publishing. https://doi.org/10.1007/978-3-319-50106-2_16.
Soper, A. J., C. Walshaw, and M. Cross. 2004. “A Combined Evolutionary Search and Multilevel Optimisation Approach to Graph-Partitioning.” Journal of Global Optimization 29 (2): 225–41. https://doi.org/10.1023/B:JOGO.0000042115.44455.f3.
Subramanian, Arvind, and Shang-Jin Wei. 2007. “The WTO Promotes Trade, Strongly but Unevenly.” Journal of International Economics 72 (1): 151–75. https://doi.org/https://doi.org/10.1016/j.jinteco.2006.07.007.
Sugiyama, Kozo, Shojiro Tagawa, and Mitsuhiko Toda. 1981. “Methods for Visual Understanding of Hierarchical System Structures.” IEEE Transactions on Systems, Man, and Cybernetics 11 (2): 109–25. https://doi.org/10.1109/tsmc.1981.4308636.
Tanahashi, Yuzuru, Chien-Hsin Hsueh, and Kwan-Liu Ma. 2015. “An Efficient Framework for Generating Storyline Visualizations from Streaming Data.” IEEE Transactions on Visualization and Computer Graphics 21 (6): 730–42. https://doi.org/10.1109/tvcg.2015.2392771.
Walshaw, C. 2001. “A Multilevel Algorithm for Force-Directed Graph Drawing.” In Graph Drawing, edited by Joe Marks, 171–82. Berlin, Heidelberg: Springer Berlin Heidelberg.
Walshaw, C., and M. Cross. 2000. “Mesh Partitioning: A Multilevel Balancing and Refinement Algorithm.” SIAM J. Sci. Comput. 22 (1): 63–80. https://doi.org/10.1137/S1064827598337373.
Welch, E., and S. Kobourov. 2017. “Measuring Symmetry in Drawings of Graphs.” Computer Graphics Forum 36 (3): 341–51. https://doi.org/https://doi.org/10.1111/cgf.13192.
Zarate, David Cheng, Pierre Le Bodic, Tim Dwyer, Graeme Gange, and Peter Stuckey. 2018. “Optimal Sankey Diagrams via Integer Programming.” In 2018 IEEE Pacific Visualization Symposium (PacificVis), 135–39. https://doi.org/10.1109/PacificVis.2018.00025.
Zheng, Jonathan X., Samraat Pawar, and Dan F. M. Goodman. 2019. “Graph Drawing by Stochastic Gradient Descent.” IEEE Transactions on Visualization and Computer Graphics 25 (9): 2738–48. https://doi.org/10.1109/TVCG.2018.2859997.
Zhong, Fahai, Mingliang Xue, Jian Zhang, Fan Zhang, Rui Ban, Oliver Deussen, and Yunhai Wang. 2023. “Force-Directed Graph Layouts Revisited: A New Force Based on the t-Distribution.” IEEE Transactions on Visualization and Computer Graphics, 1–14. https://doi.org/10.1109/TVCG.2023.3238821.
Zhu, Minfeng, Wei Chen, Yuanzhe Hu, Yuxuan Hou, Liangjun Liu, and Kaiyuan Zhang. 2021. “DRGraph: An Efficient Graph Layout Algorithm for Large-Scale Graphs by Dimensionality Reduction.” IEEE Transactions on Visualization and Computer Graphics 27 (2): 1666–76. https://doi.org/10.1109/TVCG.2020.3030447.